
    Toward Data-Driven Discovery of Software Vulnerabilities

    Over the years, Software Engineering, as a discipline, has recognized the potential for engineers to make mistakes and has incorporated processes to prevent such mistakes from becoming exploitable vulnerabilities. These processes span the spectrum from using unit/integration/fuzz testing, static/dynamic/hybrid analysis, and (automatic) patching to discover instances of vulnerabilities to leveraging data mining and machine learning to collect metrics that characterize attributes indicative of vulnerabilities. Among these processes, metrics have the potential to uncover systemic problems in the product, process, or people that could lead to vulnerabilities being introduced, rather than identifying specific instances of vulnerabilities. The insights from metrics can be used to support developers and managers in making decisions to improve the product, process, and/or people with the goal of engineering secure software. Despite empirical evidence of metrics' association with historical software vulnerabilities, their adoption in the software development industry has been limited. The level of granularity at which the metrics are defined, the high false positive rate from models that use the metrics as explanatory variables, and, more importantly, the difficulty in deriving actionable intelligence from the metrics are often cited as factors that inhibit metrics' adoption in practice. Our research vision is to assist software engineers in building secure software by providing a technique that generates scientific, interpretable, and actionable feedback on security as the software evolves. In this dissertation, we present our approach toward achieving this vision through (1) systematization of the vulnerability discovery metrics literature, (2) unsupervised generation of metrics-informed security feedback, and (3) continuous developer-in-the-loop improvement of the feedback. We systematically reviewed the literature to enumerate metrics that have been proposed and/or evaluated as indicative of vulnerabilities in software and to identify the validation criteria used to assess the decision-informing ability of these metrics. In addition to enumerating the metrics, we implemented a subset of them as containerized microservices. We collected the metric values from six large open-source projects and assessed the metrics' generalizability across projects, application domains, and programming languages. We then used an unsupervised approach from the literature to compute threshold values for each metric and assessed the thresholds' ability to classify risk from historical vulnerabilities. We used the metrics' values, thresholds, and interpretation to provide developers with natural language feedback on security as they contributed changes, and we used a survey to assess their perception of the feedback. We initiated an open dialogue to gain insight into their expectations of such feedback. In response to developer comments, we assessed the effectiveness of an existing vulnerability discovery approach (static analysis) and that of vulnerability discovery metrics in identifying risk from vulnerability-contributing commits.
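
    A minimal sketch of the kind of unsupervised, threshold-based risk classification the abstract describes, assuming quartile-based thresholds over observed metric values (the dissertation's actual method, metrics, and data are not reproduced here; the churn numbers and file names are illustrative):

    import statistics

    def quartile_threshold(values):
        """Use the third quartile of observed metric values as an unsupervised risk threshold."""
        return statistics.quantiles(values, n=4)[2]

    def classify_risk(metric_values, threshold):
        """Flag each entity (e.g., a file or commit) whose metric value exceeds the threshold."""
        return {name: value > threshold for name, value in metric_values.items()}

    # Illustrative metric values (hypothetical code churn per file).
    churn = {"parser.c": 12, "auth.c": 230, "ui.c": 45, "net.c": 980}
    threshold = quartile_threshold(churn.values())
    print(classify_risk(churn, threshold))
    # {'parser.c': False, 'auth.c': False, 'ui.c': False, 'net.c': True}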

    Are Intrusion Detection Studies Evaluated Consistently? A Systematic Literature Review

    Cyberinfrastructure is increasingly becoming the target of a wide spectrum of attacks, from Denial of Service to large-scale defacement of an organization's digital presence. Intrusion Detection Systems (IDSs) provide administrators a defensive edge over intruders launching such malicious attacks. However, with the sheer number of different IDSs available, one has to objectively assess the capabilities of different IDSs to select one that meets specific organizational requirements. A prerequisite for such an objective assessment is the implicit comparability of the IDS literature. In this study, we review the IDS literature to understand its implicit comparability from the perspective of the metrics used in the empirical evaluation of IDSs. We identified 22 metrics commonly used in the empirical evaluation of IDSs and constructed search terms to retrieve papers that mention each metric. We manually reviewed a sample of 495 papers and found 159 of them to be relevant. We then estimated the number of relevant papers in the entire set of papers retrieved from IEEE. We found that, in the evaluation of IDSs, many different metrics are used and the trade-offs between metrics are rarely considered. In a retrospective analysis of the IDS literature, we found that the evaluation criteria have been improving over time, albeit marginally. The inconsistencies in the use of evaluation metrics may not enable direct comparison of one IDS to another.
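
    For context, a minimal sketch of a few of the confusion-matrix-based metrics that commonly appear in such evaluations (the review itself covers 22 metrics; the counts below are illustrative, not drawn from the study):

    def ids_metrics(tp, fp, tn, fn):
        """Compute common IDS evaluation metrics from confusion-matrix counts."""
        precision = tp / (tp + fp) if (tp + fp) else 0.0   # fraction of alerts that are true intrusions
        recall = tp / (tp + fn) if (tp + fn) else 0.0      # detection rate
        fpr = fp / (fp + tn) if (fp + tn) else 0.0         # false alarm rate
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        return {"precision": precision, "recall": recall, "fpr": fpr, "f1": f1}

    # A detector that catches most intrusions but raises many false alarms:
    # reporting recall alone would hide the precision/false-alarm trade-off.
    print(ids_metrics(tp=90, fp=400, tn=9600, fn=10))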

    Continuous data-driven software engineering – towards a research agenda

    The rapid pace at which software needs to be built, together with the increasing need to evaluate changes for end users both quantitatively and qualitatively, calls for novel software engineering approaches that focus on short release cycles, continuous deployment and delivery, experiment-driven feature development, feedback from users, and rapid tool-assisted feedback to developers. Realizing these approaches requires research and innovation in automation and tooling, as well as research into the organizational changes that support flexible data-driven decision-making in the development lifecycle. Most importantly, deep synergies are needed between software engineers, managers, and data scientists. This paper reports on the results of the joint 5th International Workshop on Rapid Continuous Software Engineering (RCoSE 2019) and the 1st International Workshop on Data-Driven Decisions, Experimentation and Evolution (DDrEE 2019), which focused on the challenges and potential solutions in the area of continuous data-driven software engineering.

    DeepTC-Enhancer: Improving the Readability of Automatically Generated Tests

    Automated test case generation tools have been successfully proposed to reduce the amount of human and infrastructure resources required to write and run test cases. However, recent studies demonstrate that the readability of generated tests is very limited due to (i) uninformative identifiers and (ii) lack of proper documentation. Prior studies proposed techniques to improve test readability by generating either natural language summaries or meaningful method names. While these approaches are shown to improve test readability, they are also affected by two limitations: (1) generated summaries are often perceived as too verbose and redundant by developers, and (2) readable tests require not only proper method names but also meaningful identifiers (within-method readability). In this work, we combine template-based methods and Deep Learning (DL) approaches to automatically generate test case scenarios (elicited from natural language patterns of test case statements) and to train DL models on path-based representations of source code to generate meaningful identifier names. Our approach, called DeepTC-Enhancer, recommends documentation and identifier names with the ultimate goal of enhancing the readability of automatically generated test cases. An empirical evaluation with 36 external and internal developers shows that (1) DeepTC-Enhancer significantly outperforms the baseline approach for generating summaries and performs on par with the baseline approach for test case renaming, (2) the transformation proposed by DeepTC-Enhancer results in a significant increase in the readability of automatically generated test cases, and (3) there is a significant difference in feature preferences between external and internal developers.
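
    A minimal sketch of the template-based half of such an approach, assuming two hypothetical templates for common JUnit assertion patterns (DeepTC-Enhancer's actual templates and its DL-based identifier renaming are not reproduced here):

    import re

    # Hypothetical statement patterns mapped to natural-language scenario templates.
    TEMPLATES = [
        (re.compile(r"assertEquals\((?P<expected>[^,]+),\s*(?P<actual>.+)\)"),
         "checks that {actual} equals {expected}"),
        (re.compile(r"assertTrue\((?P<cond>.+)\)"),
         "checks that {cond} holds"),
    ]

    def summarize(statement):
        """Return a one-line scenario description for a single test statement."""
        for pattern, template in TEMPLATES:
            match = pattern.search(statement)
            if match:
                return template.format(**match.groupdict())
        return "executes " + statement  # fallback when no template matches

    print(summarize("assertEquals(200, response.getStatus())"))
    # checks that response.getStatus() equals 200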